MACROPHONE: An American English Telephone Speech Corpus

Authors

Kelsey Taussig

Jared Bernstein

Abstract

Macrophone is a corpus of approximately 200,000 utterances, recorded over the telephone from a broad sample of about 5,000 American speakers. Sponsored by the Linguistic Data Consortium (LDC), it is the first of a series of similar data sets that will be collected for major languages of the world in a cooperative project called Polyphone. It is designed to provide telephone speech suitable for the development of automatic voice-interactive telephone services. In particular, Macrophone contains training material for applications in transportation, scheduling, ticketing, database access, shopping, and other automated telephone interactions. In addition to being phonetically balanced, the spoken material refers to times, locations, monetary amounts, and interactive operations. The utterances are spoken by respondents into telephone handsets and recorded directly in 8-bit mu-law digital form through a T1 connection to the usual switched telephone network. The entire corpus will be made available by LDC in 1994. The paper describes the design of the linguistic materials in the corpus, and the process of solicitation, collection, transcription, and file preparation for the Macrophone corpus.
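
To illustrate what "8-bit mu-law digital form" means in practice, here is a minimal sketch that expands mu-law bytes into 16-bit linear PCM samples. It assumes the standard G.711 mu-law companding used on North American telephone trunks; the abstract does not describe the corpus file layout (headers, byte order, file naming), so the function names `ulaw_to_linear` and `decode_ulaw` are hypothetical helpers, not part of any Macrophone or LDC tooling.

```python
def ulaw_to_linear(byte: int) -> int:
    """Expand one G.711 mu-law byte (0-255) to a signed linear sample."""
    u = ~byte & 0xFF              # mu-law bytes are stored bit-inverted
    sign = u & 0x80               # top bit: sign
    exponent = (u >> 4) & 0x07    # next three bits: segment number
    mantissa = u & 0x0F           # low four bits: position within segment
    # Rebuild the magnitude, then remove the encoder bias of 0x84 (132).
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def decode_ulaw(data: bytes) -> list[int]:
    """Decode a raw mu-law byte string (assumed headerless) to samples."""
    return [ulaw_to_linear(b) for b in data]


if __name__ == "__main__":
    # 0xFF encodes silence; 0x80 and 0x00 are the two extreme codes.
    print(decode_ulaw(bytes([0xFF, 0x80, 0x00])))
```

Decoding to linear samples is typically the first step before feature extraction, since most signal-processing code expects linear PCM rather than companded bytes.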

Similar resources

2000 NIST Evaluation of Conversational Speech Recognition over the Telephone: English and Mandarin Performance Results

This paper documents the use of conversational telephone speech test materials in the NIST coordinated evaluation conducted early in 2000. The primary evaluation was of General American English speech, but a subsidiary evaluation of Mandarin speech was also offered. The primary test data consisted of twenty conversations collected for the original Switchboard Corpus but not released with the pu...

Inter-digit HMM: connected digit recognition using the Macrophone corpus

Continuous digit recognition over the telephone channel is a key technology for many telecommunications applications such as voice dialing, automatic banking, and credit card number entry. Speech recognizers usually achieve high performance by modeling the acoustics in Hidden Markov Models (HMMs) using large numbers of multivariate Gaussian mixtures with assumed diagonal covariance in order to m...

Vocabulary-independent recognition of American Spanish phrases and digit strings

We describe the development of an R&D recognizer for several Spanish applications, starting from an existing recognition system for American English and modest language-specific resources. The experiments emphasize achieving phonetic accuracy on telephone speech without vocabulary-specific training. We use our basic recognition engine, and simple grammar-building tools for predicting word sequenc...

A Contrastive Study of Request Speech Act in English and Persian Novels: Natural Semantic Metalanguage Approach

The Natural Semantic Metalanguage (NSM) Approach claims that there are some universalities in all languages. Speech acts seem to be present in all languages, but considering this approach, research has not indicated whether request speech act differs from one language to another. Thus, this study intended to investigate whether request strategies are used differently in English and Persian roma...

Publication date: 1994
